 different source




unclear points and will update the paper accordingly in the final version. To Reviewer #1. 1. Architectures for generators and discriminators. We adopt the generator and discriminator

Neural Information Processing Systems

We sincerely thank all the reviewers for their insightful comments to help us improve the paper. To Reviewer #2. 1. Are multiple sources more beneficial? This is largely due to the fact that domain gap also exists among different source domains. We will reorganize the layout of Figure 1 in the main paper to make it clearer. We thank the reviewer for pointing this out.


Accurate Uncertainty Estimation and Decomposition in Ensemble Learning

Neural Information Processing Systems

Ensemble learning is a standard approach to building machine learning systems that capture complex phenomena in real-world data. An important aspect of these systems is the complete and valid quantification of model uncertainty. We introduce a Bayesian nonparametric ensemble (BNE) approach that augments an existing ensemble model to account for different sources of model uncertainty. It has a theoretical guarantee in that it robustly estimates the uncertainty patterns in the data distribution, and can decompose its overall predictive uncertainty into distinct components that are due to different sources of noise and error. We show that our method achieves accurate uncertainty estimates under complex observational noise, and illustrate its real-world utility in terms of uncertainty decomposition and model bias detection for an ensemble predicting air pollution exposures in Eastern Massachusetts, USA.
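
To make the idea of decomposing predictive uncertainty concrete, here is a minimal sketch of the generic law-of-total-variance split for an equally weighted ensemble. It is not the BNE method described in the abstract, which this listing does not specify. The function name and inputs are hypothetical, and each member is assumed to report a predictive mean and variance at a single input point.

```python
from statistics import fmean, pvariance


def decompose_ensemble_variance(member_means, member_variances):
    """Split total predictive variance of an equal-weight ensemble.

    Assumption: generic law-of-total-variance decomposition, not BNE.
    Returns (noise_part, disagreement_part, total).
    """
    if len(member_means) != len(member_variances):
        raise ValueError("one mean and one variance per ensemble member")
    # Average of the per-member predictive variances (observation noise).
    noise_part = fmean(member_variances)
    # Spread of the member means around the ensemble mean (model disagreement).
    disagreement_part = pvariance(member_means)
    return noise_part, disagreement_part, noise_part + disagreement_part


noise, disagreement, total = decompose_ensemble_variance([1.0, 3.0], [0.5, 1.5])
print(noise, disagreement, total)
```

The two parts sum to the variance of the equal-weight mixture of the members' predictive distributions, which is why the split is exact under that assumption.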


Towards Holistic Disease Risk Prediction using Small Language Models

arXiv.org Artificial Intelligence

Data in the healthcare domain arise from a variety of sources and modalities, such as x-ray images, continuous measurements, and clinical notes. Medical practitioners integrate these diverse data types daily to make informed and accurate decisions. With recent advancements in language models capable of handling multimodal data, it is a logical progression to apply these models to the healthcare sector. In this work, we introduce a framework that connects small language models to multiple data sources, aiming to predict the risk of various diseases simultaneously. Our experiments encompass 12 different tasks within a multitask learning setup. Although our approach does not surpass state-of-the-art methods specialized for single tasks, it demonstrates competitive performance and underscores the potential of small language models for multimodal reasoning in healthcare.


Comparison analysis between standard polysomnographic data and in-ear-EEG signals: A preliminary study

arXiv.org Artificial Intelligence

Study Objectives: Polysomnography (PSG) currently serves as the benchmark for evaluating sleep disorders. Its discomfort, impracticality for home use, and introduction of bias in sleep quality assessment necessitate the exploration of less invasive, cost-effective, and portable alternatives. One promising contender is the in-ear-EEG sensor, which offers advantages in terms of comfort, fixed electrode positions, resistance to electromagnetic interference, and user-friendliness. This study aims to establish a methodology to assess the similarity between the in-ear-EEG signal and standard PSG. Methods: We assess the agreement between the PSG and in-ear-EEG derived hypnograms. We extract time- and frequency-domain features from 30-second PSG and in-ear-EEG epochs. We only consider the epochs where the PSG scorers and the in-ear-EEG scorers were in agreement. We introduce a methodology to quantify the similarity between PSG derivations and the single-channel in-ear-EEG. The approach relies on a comparison of distributions of selected features, extracted for each sleep stage and subject on both PSG and the in-ear-EEG signals, via a Jensen-Shannon Divergence Feature-based Similarity Index (JSD-FSI). Results: We found a high intra-scorer variability, mainly due to the uncertainty the scorers had in evaluating the in-ear-EEG signals. We show that the similarity between PSG and in-ear-EEG signals is high (JSD-FSI: 0.61 +/- 0.06 in awake, 0.60 +/- 0.07 in NREM and 0.51 +/- 0.08 in REM), and in line with the similarity values computed independently on standard PSG-channel-combinations. Conclusions: In-ear-EEG is a valuable solution for home-based sleep monitoring; however, further studies with a larger and more heterogeneous dataset are needed.
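
The distribution comparison described in the Methods can be illustrated with a short sketch. The abstract does not give the exact JSD-FSI formula, so the code below is an assumption: it histograms one feature from each signal on shared bins, computes the base-2 Jensen-Shannon divergence (which lies in [0, 1]), and reports `1 - JSD` as a similarity score. All function names and the bin count are hypothetical.

```python
import math


def _normalize(counts):
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one sample")
    return [c / total for c in counts]


def _kl_divergence(p, q):
    # Terms with p_i == 0 contribute nothing; q_i > 0 wherever p_i > 0 here.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p_counts, q_counts):
    """Base-2 Jensen-Shannon divergence between two histograms."""
    p = _normalize(p_counts)
    q = _normalize(q_counts)
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return 0.5 * _kl_divergence(p, m) + 0.5 * _kl_divergence(q, m)


def _histogram(values, lo, hi, bins):
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the maximum value into the last bin.
        idx = min(int((v - lo) / width), bins - 1) if width > 0 else 0
        counts[idx] += 1
    return counts


def feature_similarity(psg_values, ear_values, bins=10):
    """Assumed similarity score: 1 - JSD of one feature's two distributions."""
    lo = min(min(psg_values), min(ear_values))
    hi = max(max(psg_values), max(ear_values))
    p = _histogram(psg_values, lo, hi, bins)
    q = _histogram(ear_values, lo, hi, bins)
    return 1.0 - jensen_shannon_divergence(p, q)
```

Under this assumed definition, identical feature distributions score 1 and distributions with no overlapping bins score 0. In the study's setting the score would be computed per feature, sleep stage, and subject.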


One Microphone Source Separation

Neural Information Processing Systems

Source separation, or computational auditory scene analysis, attempts to extract individual acoustic objects from input which contains a mixture of sounds from different sources, altered by the acoustic environment. Unmixing algorithms such as ICA and its extensions recover sources by reweighting multiple observation sequences, and thus cannot operate when only a single observation signal is available. I present a technique called refiltering which recovers sources by a nonstationary reweighting ("masking") of frequency sub-bands from a single recording, and argue for the application of statistical algorithms to learning this masking function. I present results of a simple factorial HMM system which learns on recordings of single speakers and can then separate mixtures using only one observation signal by computing the masking function and then refiltering. If each pianist were striking keys randomly it would be very difficult to tell which note came from which piano.
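
As a rough illustration of refiltering, the sketch below applies a time-varying mask to the sub-band energies of a single mixture. The abstract says the mask is inferred by a factorial HMM. That model is not reproduced here, so the sketch swaps in an oracle binary mask computed from the known sources, which is only possible in a toy setting. The function names, the frames-by-sub-bands list layout, and the assumption that sub-band energies add in the mixture are all hypothetical.

```python
def oracle_binary_mask(source_a, source_b):
    """Mask is 1.0 where source A dominates a (frame, sub-band) cell, else 0.0.

    Assumption: stands in for the learned masking function in the abstract.
    """
    return [
        [1.0 if a >= b else 0.0 for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(source_a, source_b)
    ]


def refilter(mixture, mask):
    """Reweight each sub-band of each frame of the mixture by the mask."""
    return [
        [m * x for m, x in zip(mask_frame, mix_frame)]
        for mask_frame, mix_frame in zip(mask, mixture)
    ]


# Toy sub-band energies: 2 frames x 2 sub-bands per source.
source_a = [[4.0, 0.0], [0.0, 3.0]]
source_b = [[0.0, 2.0], [1.0, 0.0]]
# Assumed additive mixing of sub-band energies.
mixture = [[a + b for a, b in zip(fa, fb)] for fa, fb in zip(source_a, source_b)]

mask = oracle_binary_mask(source_a, source_b)
estimate_a = refilter(mixture, mask)
print(estimate_a)
```

Because the mask changes from frame to frame, the reweighting is nonstationary. A fixed filter applied to the same mixture could not separate sources that swap sub-bands over time.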


Automatically Identifying Relations Between Self-Admitted Technical Debt Across Different Sources

arXiv.org Artificial Intelligence

Self-Admitted Technical Debt (SATD) can be found in various sources, such as source code comments, commit messages, issue tracking systems, and pull requests. Previous research has established the existence of relations between SATD items in different sources; such relations can be useful for investigating and improving SATD management. However, there is currently a lack of approaches for automatically detecting these SATD relations. To address this, we propose and evaluate approaches for automatically identifying SATD relations across different sources. Our findings show that our approach outperforms baseline approaches by a large margin, achieving an average F1-score of 0.829 in identifying relations between SATD items. Moreover, we explore the characteristics of SATD relations in 103 open-source projects, describe nine major cases in which related SATD is documented in a second source, and give a quantitative overview of 26 kinds of relations.


A health telemonitoring platform based on data integration from different sources

arXiv.org Artificial Intelligence

The management of people with long-term or chronic illness is one of the biggest challenges for national health systems. In fact, these diseases are among the leading causes of hospitalization, especially for the elderly, and the huge amount of resources required to monitor them leads to problems with the sustainability of healthcare systems. The increasing diffusion of portable devices and new connectivity technologies allows the implementation of telemonitoring systems capable of supporting health care providers and lightening the burden on hospitals and clinics. In this paper, we present the implementation of a telemonitoring platform for healthcare, designed to capture several types of physiological health parameters from different consumer mobile and custom devices. Consumer medical devices can be integrated into the platform via the Google Fit ecosystem, which supports hundreds of devices, while custom devices can interact directly with the platform using standard communication protocols. The platform is designed to process the acquired data using machine learning algorithms, and to present the physiological health parameters to patients and physicians in a user-friendly, comprehensive, and easy-to-understand dashboard that monitors the parameters over time. Preliminary usability tests show good user satisfaction in terms of functionality and usefulness.


Complete Deep Learning In R With Keras & Others

#artificialintelligence

This course covers the main aspects of neural networks and deep learning. If you take this course, you can do away with taking other courses or buying books on R-based data science. In this age of big data, companies across the globe use R to sift through the avalanche of information at their disposal. By becoming proficient in neural networks and deep learning in R, you can give your company a competitive edge and boost your career to the next level! My name is Minerva Singh and I am an Oxford University MPhil (Geography and Environment) graduate.